-
Notifications
You must be signed in to change notification settings - Fork 917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Small scan-handler fixes #16721
Small scan-handler fixes #16721
Conversation
We don't yet expose the metadata in pylibcudf, so we can't handle this correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving with a small question - it may have already been considered.
@@ -222,6 +222,8 @@ def __post_init__(self) -> None: | |||
raise NotImplementedError( | |||
"Read from cloud storage" | |||
) # pragma: no cover; no test yet | |||
if any(p.startswith("https://") for p in self.paths): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about http://
or other protocols?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only way this occurs is because the user wrote scan_foo("hf://some_path/")
and it is expanded before we see it into https://huggingface..../some_path
.
So I could tighten up to match the full path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a broader brush to catch other URL-like things too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably fine. If no https:// URLs are supported then let’s not make it specific to HuggingFace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving assuming Bradley is happy (seems like he is based on the open thread and the pre-approval).
Yes, I’m happy. |
Description
Reject two more edge cases that we do not support.
We could easily support the case where the parquet read just needs to read the metadata, but it is low priority, so have not done so here.
Checklist